Average cost Markov control processes with weighted norms: value iteration
Similar resources
E. GORDIENKO and O. HERNÁNDEZ-LERMA (México) AVERAGE COST MARKOV CONTROL PROCESSES WITH WEIGHTED NORMS: VALUE ITERATION
This paper shows the convergence of the value iteration (or successive approximations) algorithm for average cost (AC) Markov control processes on Borel spaces, with possibly unbounded cost, under appropriate hypotheses on weighted norms for the cost function and the transition law. It is also shown that the aforementioned convergence implies strong forms of AC-optimality and the existence of f...
E. GORDIENKO and O. HERNÁNDEZ-LERMA (México) AVERAGE COST MARKOV CONTROL PROCESSES WITH WEIGHTED NORMS: EXISTENCE OF CANONICAL POLICIES
This paper considers discrete-time Markov control processes on Borel spaces, with possibly unbounded costs, and the long run average cost (AC) criterion. Under appropriate hypotheses on weighted norms for the cost function and the transition law, the existence of solutions to the average cost optimality inequality and the average cost optimality equation are shown, which in turn yield the exist...
AVERAGE COST SEMI-MARKOV DECISION PROCESSES
The Semi-Markov Decision model is considered under the criterion of long-run average cost. A new criterion, which for any policy considers the limit of the expected cost incurred during the first n transitions divided by the expected length of the first n transitions, is considered. Conditions guaranteeing that an optimal stationary (nonrandomized) policy exists are then presented. It is also ...
Interactive Value Iteration for Markov Decision Processes with Unknown Rewards
To tackle the potentially hard task of defining the reward function in a Markov Decision Process, we propose a new approach, based on Value Iteration, which interweaves the elicitation and optimization phases. We assume that rewards whose numeric values are unknown can only be ordered, and that a tutor is present to help compare sequences of rewards. We first show how the set of possible rewa...
A Simulation-Based Policy Iteration Algorithm for Average Cost Unichain Markov Decision Processes
In this paper, we propose a simulation-based policy iteration algorithm on Markov decision process (MDP) problems with average cost criterion under the unichain assumption, which is a weaker assumption than found in previous work. In this algorithm, 1) the problem is converted to a stochastic shortest path problem and a reference state can be chosen as any recurrent state under the current poli...
Journal
Journal title: Applicationes Mathematicae
Year: 1995
ISSN: 1233-7234,1730-6280
DOI: 10.4064/am-23-2-219-237